Corpus-Based Thesaurus Construction for Image Retrieval in Specialist Domains

نویسندگان

  • Khurshid Ahmad
  • Mariam Tariq
  • Bogdan Vrusias
  • Chris Handy
چکیده

This paper explores the use of texts that are related to an image collection, also known as collateral texts, for building thesauri in specialist domains to aid in image retrieval. Corpus linguistic and information extraction methods are used for identifying key terms and semantic relationships in specialist texts that may be used for query expansion purposes. The specialist domain context imposes certain constraints on the language used in the texts, which makes the texts computationally more tractable.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automatic Thai Ontology Construction and Maintenance System

Ontology is an essential resource to enhance the performance of Information Processing system such as information integration, document classification in taxonomies, including information retrieval and data cleaning in database system. This paper proposes three methodologies for Automatic Thai Ontology Construction and Maintenance from technical corpus, dictionary and thesaurus. For corpus base...

متن کامل

Building Thesaurus from Manual Sources and Automatic Scanned Texts

This paper describes the work done in the TIPS project about the construction of a thesaurus base. This construction is a merge from a thesaurus manually built and one automatically extracted from large text corpora. Several manually built thesaurus have been semiformatted to be merged in a consistent common base. The automatic extraction is based on both syntax and statistics. We present in th...

متن کامل

Statistical Thesaurus Construction for a Morphologically Rich Language

Corpus-based thesaurus construction for Morphologically Rich Languages (MRL) is a complex task, due to the morphological variability of MRL. In this paper we explore alternative term representations, complemented by clustering of morphological variants. We introduce a generic algorithmic scheme for thesaurus construction in MRL, and demonstrate the empirical benefit of our methodology for a Heb...

متن کامل

Evaluation of a Thesaurus-Based Query Expansion Technique

based query expansion method for information retrieval. The query expansion process assigns weights to different types of relations obtained from vocabulary structures, providing an efficient way to measure distances between different terms. This method was applied to a Portuguese juridical corpus and evaluated over the top-27 queries used in the web site of the Portuguese Attorney General's Of...

متن کامل

Toward a Pan-Chinese Thesaurus

In this paper, we propose a corpus-based approach to the construction of a Pan-Chinese lexical resource, starting out with the aim to enrich existing Chinese thesauri in the Pan-Chinese context. The resulting thesaurus is thus expected to contain not only the core senses and usages of Chinese lexical items but also usages specific to individual Chinese speech communities. We introduce the ratio...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003